AITopics | nonconvex optimization

Decentralized Matrix Sensing: Statistical Guarantees and Fast Convergence

Neural Information Processing Systems, Apr-28-2026, 18:46:02 GMT

We explore the matrix sensing problem from near-isotropic linear measurements, distributed across a network of agents modeled as an undirected graph, with no server. We provide the first study of statistical and computational/communication guarantees for a decentralized gradient algorithm that solves the (nonconvex) Burer-Monteiro type decomposition associated with the low-rank matrix estimation. With small random initialization, the algorithm displays an approximate two-phase convergence: (i) a spectral phase that aligns the iterates' column space with the underlying low-rank matrix, mimicking centralized spectral initialization (not directly implementable over networks); and (ii) a local refinement phase that diverts the iterates from certain degenerate saddle points, while ensuring swift convergence to the underlying low-rank matrix. Central to our analysis is a novel "in-network" Restricted Isometry Property which accounts for the decentralized nature of the optimization, revealing an intriguing interplay between sample complexity, network connectivity and topology, and communication complexity.
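
For orientation, the objects named in this abstract can be written down in a generic form. The block below is only a sketch under stated assumptions: the notation ($m$ agents, $N$ measurements per agent, symmetric sensing matrices $A_{i,k}$, factor $U \in \mathbb{R}^{d \times r}$, gossip weights $W_{ij}$, step size $\alpha$) is introduced here for illustration, and the update shown is the textbook decentralized gradient descent step, which may differ from the exact algorithm analyzed in the paper.

```latex
% Assumed notation, not taken from the paper.
% Agent i holds measurements of the unknown low-rank matrix M^\star:
y_{i,k} = \langle A_{i,k}, M^\star \rangle, \qquad k = 1, \dots, N .

% Burer-Monteiro factorization M = U U^\top gives the local nonconvex loss:
f_i(U) = \frac{1}{4N} \sum_{k=1}^{N}
         \bigl( \langle A_{i,k}, U U^\top \rangle - y_{i,k} \bigr)^2 ,
\qquad
\nabla f_i(U) = \frac{1}{N} \sum_{k=1}^{N}
         \bigl( \langle A_{i,k}, U U^\top \rangle - y_{i,k} \bigr) A_{i,k} U .

% Generic decentralized gradient step: average with neighbors, then descend.
U_i^{t+1} = \sum_{j=1}^{m} W_{ij} \, U_j^{t} - \alpha \, \nabla f_i\bigl(U_i^{t}\bigr),
\qquad
U_i^{0} \text{ small and random.}
```

Here $W_{ij}$ is nonzero only when agents $i$ and $j$ are neighbors in the graph, so each step uses local communication only and no server.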

artificial intelligence, machine learning, matrix, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

simple-saddle-camera-version

张辰逸

Neural Information Processing Systems, Apr-25-2026, 17:14:13 GMT

Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm such that for a smooth function $f\colon \mathbb{R}^n \to \mathbb{R}$, it outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}(\log n/\epsilon^{1.75})$ iterations. Compared to the previous state-of-the-art algorithms by Jin et al. with $\tilde{O}(\log^4 n/\epsilon^2)$ or $\tilde{O}(\log^6 n/\epsilon^{1.75})$ iterations, our algorithm is polynomially better in terms of $\log n$ and matches their complexities in terms of $1/\epsilon$.
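
To make the saddle-escape problem concrete, here is a minimal sketch of the classic perturbation idea (gradient descent plus a small random kick when the gradient is tiny), in the spirit of the Jin et al. baseline mentioned above. It is not the algorithm proposed in this paper. The test function, the name `perturbed_gd`, and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def grad(p):
    """Gradient of f(x, y) = x^2/2 - y^2/2 + y^4/4 (illustrative function).

    The origin is a saddle point; the minima sit at (0, 1) and (0, -1).
    """
    x, y = p
    return (x, -y + y ** 3)


def perturbed_gd(start, step=0.1, grad_tol=1e-3, radius=1e-2,
                 wait=50, iters=2000, seed=0):
    """Gradient descent that adds a small random kick when the gradient is tiny.

    All parameters are assumptions for this sketch, not tuned values.
    """
    rng = random.Random(seed)
    x, y = start
    last_kick = -wait  # allow a kick right away if needed
    for t in range(iters):
        gx, gy = grad((x, y))
        if math.hypot(gx, gy) < grad_tol and t - last_kick >= wait:
            # Near a first-order stationary point: perturb uniformly in a disc.
            angle = rng.uniform(0.0, 2.0 * math.pi)
            r = radius * math.sqrt(rng.random())
            x += r * math.cos(angle)
            y += r * math.sin(angle)
            last_kick = t
            gx, gy = grad((x, y))
        x -= step * gx
        y -= step * gy
    return x, y
```

Started from `(1.0, 0.0)`, plain gradient descent (the same routine with `radius=0.0`) never leaves the line `y = 0` and slides into the saddle at the origin, whereas the perturbed variant picks up a nonzero `y` component and is then pushed by the negative curvature toward one of the two minima.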

algorithm, artificial intelligence, machine learning, (13 more...)

Neural Information Processing Systems

Country: Asia (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.57)

2c8d9636f74d0207ff4f65956010f450-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 06:33:18 GMT

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

2c8d9636f74d0207ff4f65956010f450-Paper-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 06:33:14 GMT

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Zeroth-Order Methods for Nondifferentiable, Nonconvex, and Hierarchical Federated Optimization

Neural Information Processing Systems, Apr-24-2026, 14:57:45 GMT

Federated learning (FL) has emerged as an enabling framework for communication-efficient decentralized training. We study three broadly applicable problem classes in FL: (i) Nondifferentiable nonconvex federated optimization; (ii) Federated bilevel optimization; (iii) Federated minimax problems. Notably, in an implicit sense, both (ii) and (iii) are instances of (i). However, the hierarchical problems in (ii) and (iii) are often complicated by the absence of a closed-form expression for the implicit objective function. Unfortunately, research on these problems has been limited and afflicted by reliance on strong assumptions, including the need for differentiability and L-smoothness of the implicit function. We address this shortcoming by making the following contributions. In (i), by leveraging convolution-based smoothing and Clarke's subdifferential calculus, we devise a randomized smoothing-enabled zeroth-order FL method and derive communication and iteration complexity guarantees for computing an approximate Clarke stationary point. To contend with (ii) and (iii), we devise a unified randomized implicit zeroth-order FL framework, equipped with explicit communication and iteration complexities. Importantly, our method utilizes delays during local steps to skip making calls to the inexact lower-level FL oracle.
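
The randomized-smoothing idea behind zeroth-order methods can be sketched in a few lines. The example below is a single-agent illustration only, with no federation, local steps, or delays, so it is not the method of this paper. It uses the standard two-point estimator with a direction drawn uniformly from the unit sphere, which estimates the gradient of a smoothed surrogate of `f` from function values alone. The names `zo_gradient` and `zo_descent`, the test function `objective`, and all parameter values are hypothetical.

```python
import math
import random


def unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in g))
        if norm > 0.0:
            return [c / norm for c in g]


def zo_gradient(f, x, eta, rng):
    """Two-point estimate of the gradient of the eta-smoothed version of f.

    Only function values are used, so f need not be differentiable.
    """
    dim = len(x)
    v = unit_vector(dim, rng)
    plus = f([xi + eta * vi for xi, vi in zip(x, v)])
    minus = f([xi - eta * vi for xi, vi in zip(x, v)])
    scale = dim * (plus - minus) / (2.0 * eta)
    return [scale * vi for vi in v]


def zo_descent(f, x0, step=0.05, eta=0.1, iters=500, seed=0):
    """Plain descent loop driven by the zeroth-order estimate (assumed settings)."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        g = zo_gradient(f, x, eta, rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def objective(x):
    """Hypothetical nonsmooth test function (not from the paper)."""
    return abs(x[0]) + 2.0 * abs(x[1])
```

Calling `zo_descent(objective, [3.0, -2.0])` drives the iterate toward the kink at the origin even though `objective` has no gradient there. The smoothing radius `eta` trades bias (a larger radius blurs the function more) against the variance of the estimate.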

artificial intelligence, machine learning, optimization, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Industry: Government (0.46)

Technology: